Genome-Wide Analysis of Tandem Repeats in Plants and Green Algae
Abstract
Tandem repeats (TRs) are widespread in the genomes of prokaryotes and eukaryotes. Using the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs on a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Across the 31 species, no significant correlation was detected between TR density and genome size. Interestingly, the green alga Chlamydomonas reinhardtii (42,059 bp/Mbp) and the castor bean Ricinus communis (55,454 bp/Mbp) showed much higher TR densities than all other species (13,209 bp/Mbp on average). In the 29 land plants, comprising 22 dicots, 5 monocots, and 2 bryophytes, the 5'-UTR and upstream intergenic 200-nt (UI200) regions had the highest and second-highest TR densities, respectively, whereas in the two green algae (C. reinhardtii and Volvox carteri) the highest and second-highest densities were found in intron and coding sequence (CDS) regions, respectively. In CDS regions, trinucleotide and hexanucleotide motifs were the most frequent in all species. In intron regions, especially in the two green algae, significantly more TRs were detected near intron-exon junctions. Within intergenic regions in dicots and monocots, more TRs were found near both the 5' and 3' ends of genes. GO annotation in the two green algae revealed that genes with TRs in introns are significantly associated with transcriptional and translational processing. As the first systematic examination of TRs in plant and green algal genomes, our study showed that TRs are nonrandomly distributed in both intragenic and intergenic regions, suggesting potential roles in transcriptional or translational regulation in plants and green algae.
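To make the bp/Mbp unit concrete, here is a minimal Python sketch of how a TR density of this kind could be computed. It is not the authors' pipeline: the abstract does not name the detection tool or its thresholds, so the regex-based detector, the function names, and the parameters (motif length 1 to 6 nt, at least 3 perfect copies) are all assumptions chosen for illustration.

```python
import re


def find_perfect_trs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, end, unit) for perfect tandem repeats in seq.

    Hypothetical helper: a lazy capture group picks the shortest repeat
    unit, and the back-reference requires the remaining copies.
    Matches are non-overlapping and reported left to right.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


def tr_density_bp_per_mbp(tr_bp, region_bp):
    """TR density: base pairs covered by TRs per million base pairs of region."""
    if region_bp <= 0:
        raise ValueError("region_bp must be positive")
    return tr_bp * 1_000_000 / region_bp


# Toy 24-bp sequence: an (AC)4 repeat and a (CAG)3 repeat with short spacers.
toy = "GGT" + "AC" * 4 + "TTC" + "CAG" * 3 + "T"
repeats = find_perfect_trs(toy)
covered = sum(end - start for start, end, _ in repeats)
density = tr_density_bp_per_mbp(covered, len(toy))
```

On the toy sequence the two repeats cover 17 of 24 bp. The same density function can be applied per region class (for example CDS, intron, or 5'-UTR) by passing the TR-covered length and total length of that class, which is how the per-region comparisons in the abstract could be reproduced in principle.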
Similar resources
Dynamic Evolution of Telomeric Sequences in the Green Algal Order Chlamydomonadales
Telomeres, which form the protective ends of eukaryotic chromosomes, are a ubiquitous and conserved structure of eukaryotic genomes but the basic structural unit of most telomeres, a repeated minisatellite motif with the general consensus sequence T(n)A(m)G(o), may vary between eukaryotic groups. Previous studies on several species of green algae revealed that this group exhibits at least two t...
Inferring Ancestral Chloroplast Genomes with Inverted Repeat
Genome evolution is shaped not only by nucleotide substitutions, but also by structural changes including gene and genome duplications, insertions/deletions and gene order rearrangements. Reconstruction of phylogeny based on gene order changes has been limited to cases where equal gene content or few deletions can be assumed. Since conserved duplicated regions are present in many Chloroplast ge...
Independence of color intensity variation in red flesh apples from the number of repeat units in promoter region of the MdMYB10 gene as an allele to MdMYB1 and MdMYBA
MdMYB10 gene expression results in accumulation of anthocyanin in many tissues, including the flesh of apple fruit. The MdMYB1 and MdMYBA genes are close homologues of the MdMYB10 gene, and both are responsible for the red color phenotype in apple fruit skin. In the current study, an apple genome sequence draft analysis indicated that these three genes are located in a unique contig. Further a...
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants
Simple sequence repeats (SSRs) are regions of DNA sequence that contain repeating motifs 1-6 nucleotides long. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences of Viridiplantae (as of 18 September 2014) are available at the NCBI organelle genome resource. It provides opportunity to mine th...
Microsatellite analysis in organelle genomes of Chlorophyta
Simple Sequence Repeats (SSRs), or microsatellites, constitute a significant portion of genomes; however, their significance in organellar genomes is not completely understood. The availability of organelle genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. In the present work, SSRs were identified and categorized in 14 mitochondrial and ...